There are n cities. Some of them are connected, while some are not. If city a is connected directly with city b, and city b is connected directly with city c, then city a is connected indirectly with city c.
A province is a group of directly or indirectly connected cities and no other cities outside of the group.
You are given an n x n matrix isConnected where isConnected[i][j] = 1 if the ith city and the jth city are directly connected, and isConnected[i][j] = 0 otherwise.
Return the total number of provinces.
Example 1:
Input: isConnected = [[1,1,0],[1,1,0],[0,0,1]] Output: 2
Example 2:
Input: isConnected = [[1,0,0],[0,1,0],[0,0,1]] Output: 3
Constraints:
1 <= n <= 200n == isConnected.lengthn == isConnected[i].lengthisConnected[i][j]is1or0.isConnected[i][i] == 1isConnected[i][j] == isConnected[j][i]
Average Rating: 3.76 (66 votes)
Solution
Approach #1 Using Depth First Search[Accepted]
Algorithm
The given matrix can be viewed as the Adjacency Matrix of a graph. By viewing the matrix in such a manner, our problem reduces to the problem of finding the number of connected components in an undirected graph. In order to understand the above statement, consider the example matrix below:
M= [1 1 0 0 0 0
1 1 0 0 0 0
0 0 1 1 1 0
0 0 1 1 0 0
0 0 1 0 1 0
0 0 0 0 0 1]
If we view this matrix M as the adjancency matrix of a graph, the following graph is formed:
In this graph, the node numbers represent the indices in the matrix M and an edge exists between the nodes numbered i and j, if there is a 1 at the corresponding M[i][j].
In order to find the number of connected components in an undirected graph, one of the simplest methods is to make use of Depth First Search starting from every node. We make use of visited array of size N(M is of size NxN). This visited[i] element is used to indicate that the ith node has already been visited while undergoing a Depth First Search from some node.
To undergo DFS, we pick up a node and visit all its directly connected nodes. But, as soon as we visit any of those nodes, we recursively apply the same process to them as well. Thus, we try to go as deeper into the levels of the graph as possible starting from a current node first, leaving the other direct neighbour nodes to be visited later on.
The depth first search for an arbitrary graph is shown below:
From the graph, we can see that the components which are connected can be reached starting from any single node of the connected group. Thus, to find the number of connected components, we start from every node which isn't visited right now and apply DFS starting with it. We increment the count of connected components for every new starting node.
Complexity Analysis
-
Time complexity : O(n2). The complete matrix of size n2 is traversed.
-
Space complexity : O(n). visited array of size n is used.
Approach #2 Using Breadth First Search[Accepted]
Algorithm
As discussed in the above method, if we view the given matrix as an adjacency matrix of a graph, we can use graph algorithms easily to find the number of connected components. This approach makes use of Breadth First Search for a graph.
In case of Breadth First Search, we start from a particular node and visit all its directly connected nodes first. After all the direct neighbours have been visited, we apply the same process to the neighbour nodes as well. Thus, we exhaust the nodes of a graph on a level by level basis. An example of Breadth First Search is shown below:
In this case also, we apply BFS starting from one of the nodes. We make use of a visited array to keep a track of the already visited nodes. We increment the count of connected components whenever we need to start off with a new node as the root node for applying BFS which hasn't been already visited.
Complexity Analysis
-
Time complexity : O(n2). The complete matrix of size n2 is traversed.
-
Space complexity : O(n). A queue and visited array of size n is used.
Approach #3 Using Union-Find Method[Accepted]
Algorithm
Another method that can be used to determine the number of connected components in a graph is the union find method. The method is simple.
We make use of a parent array of size N. We traverse over all the nodes of the graph. For every node traversed, we traverse over all the nodes directly connected to it and assign them to a single group which is represented by their parent node. This process is called forming a union. Every group has a single parent node, whose own parent is given by -1.
For every new pair of nodes found, we look for the parents of both the nodes. If the parents nodes are the same, it indicates that they have already been united into the same group. If the parent nodes differ, it means they are yet to be united. Thus, for the pair of nodes (x,y), while forming the union, we assign parent[parent[x]]=parent[y], which ultimately combines them into the same group.
The following animation depicts the process for a simple matrix:
At the end, we find the number of groups, or the number of parent nodes. Such nodes have their parents indicated by a -1. This gives us the required count.
Complexity Analysis
- Time complexity : O(n3). We traverse over the complete matrix once. Union and find operations take O(n) time in the worst case.
- Space complexity : O(n). parent array of size n is used.
Last Edit: October 23, 2018 12:52 PM
I think for Union-Find approach, the complexity could be O(n^2) if path compression is used. Union-Find complexity with path compression is O(m), which m is operations (either union/find); in this case, since m[i][j] = m[j][i], the union operation is at most n*(n-1)/2, therefore it is O(n^2)
Last Edit: February 22, 2019 8:05 PM
If use path compression and union by rank to optimize union find solution, what time complexity will it be?
As big O of union operation is determined by find operation, and find has amortized time O(a(n)) where a(n) < 5 in practical. So total time will be O(n^2) IMO.
July 29, 2020 7:25 AM
This article needs more explanation for the code logic. Don't just assume people understand what the code does after explaining the higher level algorithm. Please explain more in detail to make the code more digestible! Thank you!
Last Edit: November 19, 2020 1:25 AM
There is one small caveat of BFS solution here.
In general, one should mark a node as visited when it is discovered. Otherwise, we may end up with duplicate in the queue. For example, start from node 0, which is connected to node 1 and 2. If node 3 is connected to both 1 and 2, then we may add node 3 twice into the queue. Of course, the final result is the still the same, but we end up with doing unnecessary work, which could scale up with the number of edges.
A sample code in python for your reference:
def findCircleNum(self, M: List[List[int]]) -> int:
n = len(M)
visited = [False] * n
cnt = 0
q = deque()
for i in range(n):
if not visited[i]:
visited[i] = True # this setup assume the start node is visited
q.append(i)
while q:
s = q.popleft()
for j in range(n):
if M[s][j] == 1 and not visited[j]:
visited[j] = True # mark visited at discovery
q.append(j)
cnt += 1
return cnt
EDIT: If you're interested about the state such as discovery, please check Introduction to algorithm, 3ed, chapter 22, where a node is color coded as WHITE (undiscovered), GREY (discovered but not processed), and BLACK (processed).
Last Edit: September 26, 2018 12:03 PM
For the union-find complexity analysis: If we traverse over the complete matrix O(n^2) and union/find operations take O(n), wouldn't the complexity be O(n^3)?
February 11, 2021 10:11 AM
This is basically the same question as number of islands. I liked it though.
Last Edit: June 25, 2019 7:34 AM
-
We can simplify/optimize code by using
boolean[] visited
instead of
int[] visited -
Also, for the DFS solution it's better to mark a student as visited right after we got into the dfs method, otherwise we make unnecessary dfs calls on M[0][0], M[1][1], etc..
Last Edit: May 5, 2020 8:57 AM
the position of visited.append() in Solution 1 is not ideal, it causes confusion and some extra visits. A more clear version of solution 1:
"""
class Solution:
def dsf(self, r, grid, visited):
visited.append(r)
for j in range(len(grid)):
if grid[r][j] == 1 and j not in visited:
self.dsf(j, grid, visited)
def findCircleNum(self, M) -> int:
visited = []
count = 0
for row in range(len(M)):
if row not in visited:
self.dsf(row, M, visited)
count += 1
return count
"""
April 4, 2017 12:06 AM
Agree with tiny_code, time complexity should O(N^2).
Time Submitted | Status | Runtime | Memory | Language |
|---|---|---|---|---|
| 10/07/2020 11:53 | Accepted | 52 ms | 14.1 MB | cpp |
| 10/07/2020 11:51 | Wrong Answer | N/A | N/A | cpp |
| 09/07/2020 19:46 | Accepted | 56 ms | 14.3 MB | cpp |
| 09/07/2020 19:43 | Accepted | 68 ms | 15.2 MB | cpp |
| 09/07/2020 19:43 | Compile Error | N/A | N/A | cpp |
| 09/07/2020 19:42 | Accepted | 56 ms | 13.8 MB | cpp |
| 09/07/2020 19:35 | Wrong Answer | N/A | N/A | cpp |
xxxxxxxxxxclass Solution {public: int findCircleNum(vector<vector<int>>& isConnected) { vector<bool> visited(isConnected.size(), false); int res = 0; for(int i=0;i<isConnected.size();i++) { if(visited[i] == false) { dfs(isConnected, visited, i); res++; } } return res; } void dfs(vector<vector<int>>& isConnected, vector<bool>& visited, int i) { for(int j=0;j<isConnected.size();j++) { if(isConnected[i][j] == 1 && visited[j] == false) { visited[j] = true; dfs(isConnected, visited, j); } } }};